Patent Mining: A Baseline Approach
Abstract
For NTCIR Workshop 7, UC Berkeley participated in both the IR4QA and the Patent Mining tasks. This paper summarizes our approach to Patent Mining. We focused on the US Patent collection; our methodology was to treat patent mining as an information retrieval task and to aggregate multiple patent classifications from the retrieved patent documents. Performance was relatively poor, possibly because we retrieved too many documents or because we did not use blind feedback techniques.
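The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how classifications were combined, so the score-weighted voting below is an assumption, as are the function name `aggregate_classifications`, the `top_k` and `top_n` parameters, and the placeholder class labels. Limiting `top_k` reflects the abstract's remark that retrieving too many documents may have hurt performance.

```python
from collections import defaultdict


def aggregate_classifications(retrieved, top_k=10, top_n=3):
    """Rank classification codes by summed retrieval score.

    retrieved: list of (score, [codes]) pairs, sorted by descending score.
    top_k:     how many retrieved documents are allowed to vote (assumed).
    top_n:     how many codes to return (assumed).
    """
    votes = defaultdict(float)
    for score, codes in retrieved[:top_k]:
        for code in codes:
            # Each retrieved document votes for its codes with its score.
            votes[code] += score
    # Sort by descending vote total, breaking ties alphabetically.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:top_n]]


# Hypothetical retrieval results: (retrieval score, codes of that patent).
hits = [
    (0.9, ["class_a", "class_b"]),
    (0.7, ["class_a"]),
    (0.4, ["class_c"]),
]
print(aggregate_classifications(hits))
```

A smaller `top_k` keeps low-ranked documents from contributing codes; other combination rules (for example, unweighted counts) would fit the same interface.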
Related resources
Development of a Patent Matching System Using a Hybrid Approach
Much research has applied various data mining or text mining tools to patent analysis, and many scholars and experts have verified the accuracy and feasibility of those tools. However, since mining tools always try to analyze the content using some mathematical methodology, such as linguistic algorithms, they neglect the fact that patent records are combinations o...
An Automated Research Paper Classification Method for the IPC system with the Concept Base
In the present paper, a classification method using the Concept Base is proposed and evaluated in the Patent Mining Task of the NTCIR-7 workshop. In this task, research papers are classified into the International Patent Classification (IPC) system. The classification enables research papers to be located on a patent map. In order to classify a paper, patent documents that are similar to the pa...
Corporate Decision Making with Self-Organizing Patent Maps Labeled by Technical Terms and AHP
In this paper, we propose an approach for corporate decision making with self-organizing patent maps labeled by technical terms and AHP. First, we select the patent area of interest and collect pertinent patent documents in text format. Second, we extract keywords by text mining to transform patent documents into feature vectors of the companies. Third, we input the feature matrix of technical ...
Multi-label Classification using Logistic Regression Models for NTCIR-7 Patent Mining Task
We design a multi-label classification system based on a machine learning approach for the NTCIR-7 Patent Mining Task. In our system, we employ a logistic regression model for each International Patent Classification (IPC) code that determines the IPC code assignment of research papers. The logistic regression models are trained by using patent documents provided by task organizers. To mitigate...
Constructing a Broad-coverage Lexicon for Text Mining in the Patent Domain
For mining intellectual property texts (patents), a broad-coverage lexicon that covers general English words together with terminology from the patent domain is indispensable. The patent domain is very diffuse as it comprises a variety of technical domains (e.g. Human Necessities, Chemistry & Metallurgy and Physics in the International Patent Classification). As a result, collecting a lexicon t...